Abstract

When using methods for recovering the least-squares optimal
pose+registration parameters between a model and an image suite,
construction of the feature correspondences is key. Inclusion of
outliers in the correspondence set can severely degrade the
performance and fidelity of these pose+coregistration estimation
algorithms. We consider the use of median filtering in the
construction of consistent correspondences. Finally, we test our
coregistration with a median filtering system on real ATR data.

1  Introduction 

We have been investigating sensor suite pose recovery in a
multi-modal ATR domain. Our formulation considers not only the
relative pose between the model and the sensor suite, but also the
constrained intersensor registration. A least-mean-square-error
algorithm has been developed for simultaneous estimation of both
pose and registration (pose+registration) [7]. As a shorthand, the
term coregistration is used to describe this process.

Our coregistration algorithm takes as input a set of correspondences
between model and image features (points and lines), together with an
initial coregistration estimate. It then uses a non-linear
optimization procedure to arrive at a coregistration estimate which
minimizes a sum-of-squared-error between these corresponding
features. The initial set of corresponding features is derived from
initial expectations regarding the sensor registration and object
pose.

As with any least-mean-square-error fitting method, the system is
sensitive to outliers in the correspondence set. Local search has
been shown, using features from optical imagery, not only to remove
outliers, but to find optimal sets of corresponding features even
when perhaps only 5% of the total candidate features match [2]. Our
long term goal is an extension of this work to multisensor matching
using the fitting procedure defined in [7]. However, our current
reliance upon ungrouped points to represent range leads quickly to
search spaces of intractable size. One solution is to perform some
grouping on the range data, and work on this has begun. Another is
to consider a more conventional approach to outlier removal: median
filtering. This paper presents a least-median-squared-error extension
of our previous coregistration work and demonstrates the algorithm on
real ATR data.

2  Background 

In our past work [7], we presented a coregistration recovery
algorithm. We hypothesized that utilizing constraints between the
sensors allowed for a more accurate pose estimate to be computed. We
also noted that data in some multimodal domains, including ATR, tends
not to be boresighted. Due to mechanical vibrations and torsions,
day-to-day variations of several pixels can be expected. Our
coregistration takes these inaccuracies in the sensor registration
into account, applying corrections to the intersensor translation.

The least-squares algorithm developed in [7] and extended in [1]
utilizes an iterative non-linear optimization method. Such algorithms
require an initial parameter estimate. The fidelity of this initial
pose+registration estimate partially determines whether the
coregistration algorithm will converge to the correct minimum on the
error surface. To investigate this problem, two sets of experiments
were conducted and reported in [7]. The first experiment tested the
recovery of pose+registration parameters given a poor initial
estimate using perfect synthetic data. It was found that the
algorithm could recover, with high probability, from initial
estimates up to 45° and 100 meters off.

The second experiment utilized synthetic data with Gaussian noise
introduced and a perfect initial pose+registration parameter
estimate. This experiment placed some bounds on the recoverability of
the coregistration parameters. We found that the amount of error
tolerable in the image is related to the resolution of the image. As
the standard deviation of the measurement error increases, the
convergence point tends to migrate. Practically speaking, this
implies that the higher-resolution color data can have greater
pixel-uncertainty, while the low-resolution range data needs to have
relatively little pixel-error.

When we attempted to coregister real ATR images from the Fort Carson
data set, we found after considerable (unpublished) exploration that
our algorithm proved sensitive to both the geometry of the model and
image features and also to the correctness of the correspondence set.
If outliers were introduced into the correspondence set, we found
convergence to neighboring minima, with large rotational errors.

Some statistical methods have been proposed in the traditional
literature and have been utilized by vision researchers to construct
outlier-free correspondence sets. Kumar [4] utilized median filtering
to evaluate his 3-D full perspective pose recovery system and here we
will adapt it to coregistration.

3  Median  Filtering 

Least-squares methods, such as our E_fit measure, assume that the
data has Gaussian random noise added to it. If, however, the
correspondence data contains outliers, our method will be thrown off.
Median filtering is a robust statistic for detecting and removing
outliers.

Median  filtering  [6]  handles  outliers  by  fitting  to 
the  subset  of  the  data  which  minimizes  the  ensemble 
median error value. It is a robust statistic when fewer than 50% of
the data are outliers. This is in contrast to the
mean  around  which  least-squares  algorithms  are  based, 
where  a  single  outlier  can  radically  shift  the  result.  The 
subset  which  minimizes  the  median  error  must  contain 
no  outliers,  otherwise  it  would  skew  the  error,  increasing 
the  median.  And  since  the  median  is  insensitive  to  up 
to  50%  outliers,  so  is  median  filtering. 
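
The contrast between the mean and the median can be seen on a handful of made-up error values. The numbers below are purely illustrative and are not taken from the ATR experiments.

```python
import statistics

# Hypothetical residuals: five well-behaved values plus one gross outlier.
clean = [0.9, 0.95, 1.0, 1.05, 1.1]
with_outlier = clean + [100.0]

mean_clean = statistics.mean(clean)
mean_dirty = statistics.mean(with_outlier)
median_clean = statistics.median(clean)
median_dirty = statistics.median(with_outlier)

# The single outlier drags the mean far from the bulk of the data,
# while the median barely moves.
print("mean:  ", mean_clean, "->", mean_dirty)
print("median:", median_clean, "->", median_dirty)
```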

The down side is that, for non-differentiable error functions, a
combinatorial search of the subset space must be performed. To
approximate the complete combinatorial search, we can select a number
of small subsets, assuming that we have a high probability of
sampling at least one subset which contains no outliers. This yields
the optimal fit, and allows us to throw out all data not accounted
for by the Gaussian assumption (i.e., outside of two standard
deviations of the best fit function, since this will contain 98% of
the data affected by Gaussian noise).
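
The random-subset approximation can be sketched on a much simpler problem than coregistration. The example below is an assumption-laden stand-in: it fits a 2-D line from two-point subsets instead of fitting pose+registration from lines and range points, and the data, the function names (`fit_line`, `least_median_fit`) and the subset count are all hypothetical. Only the selection rule, keeping the subset fit with the smallest median squared error over the whole data set, follows the text.

```python
import random
import statistics

def fit_line(p, q):
    """Exact line y = m*x + b through two points with distinct x."""
    (x1, y1), (x2, y2) = p, q
    m = (y2 - y1) / (x2 - x1)
    return m, y1 - m * x1

def squared_residuals(model, points):
    m, b = model
    return [(y - (m * x + b)) ** 2 for x, y in points]

def least_median_fit(points, n_subsets, seed=0):
    """Fit to random minimal subsets; keep the fit whose median
    squared residual over ALL points is smallest."""
    rng = random.Random(seed)
    best_model, best_median = None, float("inf")
    for _ in range(n_subsets):
        p, q = rng.sample(points, 2)
        if p[0] == q[0]:
            continue  # vertical line, skip this subset
        model = fit_line(p, q)
        med = statistics.median(squared_residuals(model, points))
        if med < best_median:
            best_model, best_median = model, med
    return best_model, best_median

# Hypothetical data: 14 points near y = 2x + 1 and 6 gross outliers.
inliers = [(float(x), 2.0 * x + 1.0 + 0.01 * ((x % 3) - 1)) for x in range(14)]
outliers = [(1.0, 30.0), (3.0, -12.0), (5.0, 25.0),
            (7.0, -8.0), (9.0, 40.0), (11.0, -15.0)]
points = inliers + outliers

model, med = least_median_fit(points, n_subsets=300)
print("slope, intercept:", model, "median squared error:", med)
```

Because the median is taken over all points, a subset that includes an outlier produces a fit that most of the data disagrees with, and its median error is large; only outlier-free subsets can win.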

The subsets need to be at least large enough to cover the degrees of
freedom, so we would need to select at least 3 optical lines and 1
range point. However, Kumar [4] found that selecting a minimal number
of features caused the solution to be sensitive to the Gaussian noise
that we assume is overlaid onto the true data. As a consequence, it
is better to select a larger subset to stabilize the optimal pose
against noise. If we select too large a subset size, however, we
greatly reduce our chances of selecting a subset with no outliers. A
compromise must be made between probability and stability.
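
The trade-off between subset size and the chance of drawing an outlier-free subset can be made concrete with a short calculation. This is a sketch under a simplifying assumption: each sampled feature is treated as an outlier independently with a fixed probability, which ignores sampling without replacement and any per-sensor normalization. The 30% outlier rate and the 300 subsets are illustrative inputs, and the function name is hypothetical.

```python
def prob_outlier_free_subset(outlier_fraction, subset_size, n_subsets):
    """Probability that at least one of n_subsets random subsets
    contains no outliers, assuming independent draws."""
    p_clean = (1.0 - outlier_fraction) ** subset_size
    return 1.0 - (1.0 - p_clean) ** n_subsets

# Larger subsets stabilize the fit but are rarely outlier-free.
for size in (4, 10, 20, 40):
    p = prob_outlier_free_subset(0.30, size, 300)
    print(f"subset size {size:2d}: P(at least one clean subset) = {p:.6f}")
```

Under these assumptions, subsets of 10 features are almost certain to include at least one clean sample in 300 draws, while subsets of 40 almost never do.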


Once we have minimized the error, we need to select a cutoff point,
above which we will consider correspondences to be outliers. We can
achieve this either by selecting some a priori threshold or by
computing one based upon the median. We choose the latter method.
Assuming a normal distribution, we can set

    cutoff = (a × s)^2

where s is an approximation of the standard deviation for a Gaussian
distribution based upon the interquartile range. Setting a to 2.0
filters out data which lies more than two standard deviations above
the error, so that the majority of the Gaussian data will be
retained.
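
A median-based cutoff of this kind can be sketched as follows. The exact expression used for s is not assumed here; instead the sketch uses the standard relation for zero-mean Gaussian residuals, in which the median absolute residual equals half the interquartile range, about 0.6745 standard deviations, so that s ≈ 1.4826 × sqrt(median squared error). Finite-sample corrections are omitted, and the function names and sample residuals are hypothetical.

```python
import statistics

def robust_sigma(squared_errors):
    """Approximate Gaussian standard deviation from the median squared
    error. 1.4826 ~= 1 / 0.6745, where 0.6745 is the upper quartile of
    the standard normal (half its interquartile range)."""
    return 1.4826 * statistics.median(squared_errors) ** 0.5

def filter_outliers(squared_errors, a=2.0):
    """Return indices whose squared error is within (a * s)^2."""
    s = robust_sigma(squared_errors)
    cutoff = (a * s) ** 2
    keep = [i for i, e in enumerate(squared_errors) if e <= cutoff]
    return keep, cutoff

# Hypothetical residuals: seven small ones and two gross outliers.
residuals = [0.1, -0.2, 0.15, -0.05, 0.3, -0.25, 0.2, 5.0, -7.0]
squared = [r * r for r in residuals]

keep, cutoff = filter_outliers(squared, a=2.0)
print("cutoff on squared error:", cutoff)
print("retained indices:", keep)
```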

4  Results  and  Discussion 

Since we wish to utilize our coregistration recovery algorithm as the
alignment-and-evaluation component of a matching system, we need to
have a notion of how our current system performs on real ATR data.

The initial set of corresponding model-image feature pairs, S, is
selected based upon spatial proximity given a hand-picked initial
pose+registration estimate. Proximity thresholds are chosen based
both upon the error we observe in the feature extraction and upon the
percentage of outliers permitted by median filtering. We use the
neighborhood (x, y, r) = ±(0.5, 0.5, 10.0) in the ladar data (x and y
are in pixels and r is range in meters) and (d, θ) = ±(30, 15) in the
color data (where d is the average distance in pixels and θ is the
rotational difference in degrees).

Median filtering is run on 300 subsets of 10 feature correspondences
each. For random feature selections, we normalized the selection so
that there is an equal probability of selecting a feature from either
the range or the optical sensor. If this were not done, the selection
would be biased towards the LADAR data (which accounts for over 95%
of the correspondences s ∈ S), and the CCD portion of the error would
often be ill-conditioned.
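
One way to realize such a normalized selection is to flip a fair coin between the two sensors for every draw and then pick uniformly within the chosen sensor's pool. This is an interpretation, not a description of the actual implementation; the function name and the 440/21 split of the correspondence set are hypothetical, chosen only so that LADAR pairs make up more than 95% of the total.

```python
import random

def sample_balanced(range_pairs, optical_pairs, k, rng):
    """Draw k distinct pairs, choosing the sensor with equal
    probability on each draw, then a pair uniformly within it."""
    pools = [list(range_pairs), list(optical_pairs)]
    if k > len(pools[0]) + len(pools[1]):
        raise ValueError("subset larger than the correspondence set")
    subset = []
    while len(subset) < k:
        pool = pools[rng.randrange(2)]  # fair coin between sensors
        if not pool:
            continue  # that sensor is exhausted; flip again
        subset.append(pool.pop(rng.randrange(len(pool))))
    return subset

# Hypothetical correspondence set dominated by LADAR pairs.
range_pairs = [("ladar", i) for i in range(440)]
optical_pairs = [("ccd", i) for i in range(21)]

subset = sample_balanced(range_pairs, optical_pairs, 10, random.Random(0))
print(subset)
```

With uniform sampling over all 461 pairs, a subset of 10 would usually contain no CCD pairs at all; the coin flip makes roughly half of each subset optical.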

For the coregistration algorithm, we used weighting factors as
described in [1] which simply replace the weights of [7] with three
intuitive factors. First is a weighting term (a) for controlling the
relative importance of the sensors. Since we lack knowledge about the
importance of these, we set a = 0.5. We are also normalizing the
individual sensor errors using what amounts to the second standard
deviation of the presumed Gaussian noise for the sensor. We will call
these the optical (CCD) and range (LADAR) terms and set them to 0.25
and 5 meters respectively. These values seem to intuitively represent
the error we are observing, though they were not found using a formal
error estimation process. In order to lend additional stability, we
invoke the Levenberg-Marquardt rule not only when the error would
increase, but also when a rotation update of greater than 10° is
proposed.

The results of the median filtering are given in Table 1. Median
filtering takes on the order of 30 minutes and this time is sensitive
to the total number of corresponding pairs s ∈ S. For each subset,
pairwise error for every pair s ∈ S must be calculated, and the pairs
ranked in order of ascending error^1. Table 1 also shows that roughly
28% to 30% of the pairs in S are outliers (128 of 461 for Shot 18 and
165 of 557 for Shot 20); the numbers given in the last column
indicate the number of pairs determined not to be outliers.

Figure 1: Median filtering results on Shot 18 (M60). Panels: (a)
initial model, (b) initial CCD, (c) initial LADAR, (d) final model,
(e) final CCD, (f) final LADAR.

                                      Correspondences
Image             CPU Time (sec)      Initial    Final
Shot 18 (M60)     1633.90             461        333
Shot 20 (M113)    2184.98             557        392

Table 1: Median filtering results

Figures 1 and 2 are initial and final coregistration estimates for
two pairs of range and optical images. The leftmost column shows the
target model itself. The middle column shows features for the optical
imagery. The white features in Figures 1b and 2b indicate the
projected 3D target silhouette. The white features in Figures 1e and
2e show both silhouette and image features determined not to be
outliers; black indicates outliers. The image features are found
using a model-driven approach described elsewhere in these
proceedings [5]. The rightmost column shows the range data. Black
squares are model range points, grey squares are LADAR pixels. Filled
squares are determined not to be outliers.

^1 It should also be kept in mind that some of the computation
expense relates to the C++ implementation of our algorithm. However,
while a more optimized version could be implemented, the dependency
on the number of features in S would still hold.

The overall global change between initial and final coregistration
estimates in Figures 1 and 2 is small. This is a consequence of
initializing the algorithm with a good initial estimate and tight
bounds upon the proximity search used to construct the set S. While
the change is small, median filtering does refine the estimates in
each case, and as indicated by the numbers of outliers removed
(Table 1), this fine adjustment is based upon a significant
refinement of the correspondence.

One characteristic of median filtering is that it tends to remove
features on surfaces viewed from an oblique angle. Examples include
the top of the M60 turret in Figure 1 or the roof of the M113 in
Figure 2. A probable explanation is that such surfaces tend to have
greater sampling error in the range.

In Figure 2e, note that almost all CCD features have been identified
as outliers. Also note that the position of the black features
suggests this is not due to an error in the feature extraction: the
features are essentially in the correct positions. It is believed
that for this example, the large planar surface of the M113 fits the
range data so well that, relative to this fit, the CCD features are
considered to be outliers.

Figure 2: Median filtering results on Shot 20 (M113). Panels: (a)
initial model, (b) initial CCD, (c) initial LADAR, (d) final model,
(e) final CCD, (f) final LADAR.

5  Conclusion 

Pose estimation methods which minimize the mean-square-error become
unstable when outliers are introduced into the correspondence. We
have already introduced one such method for simultaneously recovering
the pose+registration parameters in [7]. The previous work, however,
could not be demonstrated in a real ATR domain, due to the
unavailability of automatically extracted model and data features and
the inability to generate outlier-free correspondences.

As a consequence of work done in [8, 3], we are now able to extract
model and data features on real ATR images. In this paper, using the
real ATR image features and median filtering coregistration, we have
constructed outlier-free correspondences. Due to the expense of the
current coregistration implementation, the time required to run
median filtering is relatively high. However, it does mark a
significant number of initially considered correspondences as
outliers. We have shown that these filtered correspondences do
provide stable coregistration results.

References 

[1] Anthony N. A. Schwickerath and J. Ross Beveridge. Coregistration
of Range and Optical Images Using Coplanarity and Orientation
Constraints. In 1996 Conference on Computer Vision and Pattern
Recognition, page (submitted), San Francisco, CA, June 1996.

[2] J. Ross Beveridge. Local Search Algorithms for Geometric Object
Recognition: Optimal Correspondence and Pose. PhD thesis,
University of Massachusetts at Amherst, May 1993.

[3] J. Ross Beveridge, Mark R. Stevens, and N. A. Schwickerath.
Toward target verification through 3-d model-based sensor fusion.
IEEE Transactions on Image Processing, page (submitted), 1996.

[4] Rakesh Kumar. Model Dependent Inference of 3D Information From a
Sequence of 2D Images. PhD thesis, University of Massachusetts,
Amherst, February 1992.

[5] Mark R. Stevens and J. Ross Beveridge. 3D Model Feature
Prediction for Combined Range and Optical Object Recognition. In
Proceedings: Image Understanding Workshop, page (to appear),
Los Altos, CA, February 1996. ARPA, Morgan Kaufmann.

[6] Peter J. Rousseeuw and Annick M. Leroy. Robust Regression and
Outlier Detection. Wiley, 1987.

[7] Anthony N. A. Schwickerath and J. Ross Beveridge. Model to
Multisensor Coregistration with Eight Degrees of Freedom. In
Proceedings: Image Understanding Workshop, pages 481-490, Los
Altos, CA, November 1994. ARPA, Morgan Kaufmann.

[8] Mark R. Stevens. Obtaining 3D Silhouettes and Sampled Surfaces
from Solid Models for use in Computer Vision. Master's thesis,
Colorado State University, September 1995.




